
Feat/newstats#2

Merged
lex0c merged 8 commits into main from feat/newstats
Apr 18, 2026

Conversation

@lex0c
Owner

@lex0c lex0c commented Apr 18, 2026

No description provided.

lex0c and others added 2 commits April 18, 2026 16:50
DevProfile.Specialization is a Herfindahl concentration index (Σ pᵢ²)
over the dev's full per-directory file-count distribution. 1 = all
files in one directory (narrow specialist); 1/N for a uniform spread
across N directories; approaches 0 for broad generalists.

Surfaces a question the current profile cannot answer: "Is this
person a specialist or a generalist?" DevProfile.Scope already
shows the top 5 directories with percentages, but reading breadth
from visual percentages is subjective. One number makes the
distinction crisp.

Why Herfindahl, not Gini: Gini measures inequality among buckets,
not concentration. A dev with 1 file in 1 directory and a dev with
1 file in each of 5 directories both get Gini 0, collapsing the two
opposite ends of the spectrum. Herfindahl correctly separates them
(1 vs 0.2). An initial implementation used Gini; the review caught
the semantic error before the feature shipped and it was rewritten
before the first commit.

Four labels applied via thresholds in named constants:
  H < 0.15   broad generalist
  H < 0.35   balanced
  H < 0.7    focused specialist
  H ≥ 0.7    narrow specialist

Implementation:
- stats.go adds herfindahl() helper and specBroadGeneralistMax /
  specBalancedMax / specFocusedMax constants.
- DevProfiles computes Specialization over the full dirCount map
  (before the top-5 Scope truncation, so it reflects actual
  breadth not the truncated display).
- format.go specLabel() maps the Herfindahl value to a band.
- template.go embeds the value in the profile card grid.
- profile_template.go surfaces it inline in the Scope header of the
  standalone per-developer report.
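The helper and the label mapping described above can be sketched as follows. This is an illustration only, assuming the shapes implied by the commit message; the actual signatures in stats.go and format.go may differ.

```go
package main

import "fmt"

// herfindahl computes the Herfindahl concentration index (Σ pᵢ²)
// over a per-directory file-count distribution. Returns 0 for an
// empty or all-zero distribution.
func herfindahl(counts map[string]int) float64 {
	total := 0
	for _, c := range counts {
		total += c
	}
	if total == 0 {
		return 0
	}
	h := 0.0
	for _, c := range counts {
		p := float64(c) / float64(total)
		h += p * p
	}
	return h
}

// Threshold constants named in the commit; values from the band table above.
const (
	specBroadGeneralistMax = 0.15
	specBalancedMax        = 0.35
	specFocusedMax         = 0.7
)

// specLabel maps a Herfindahl value to one of the four bands.
func specLabel(h float64) string {
	switch {
	case h < specBroadGeneralistMax:
		return "broad generalist"
	case h < specBalancedMax:
		return "balanced"
	case h < specFocusedMax:
		return "focused specialist"
	default:
		return "narrow specialist"
	}
}

func main() {
	narrow := map[string]int{"internal/stats": 9}
	broad := map[string]int{"a": 1, "b": 1, "c": 1, "d": 1, "e": 1}
	fmt.Println(herfindahl(narrow), specLabel(herfindahl(narrow)))
	fmt.Println(herfindahl(broad), specLabel(herfindahl(broad)))
}
```

Note how the uniform 5-directory dev lands at 1/N = 0.2 ("balanced") while the single-directory dev lands at 1 ("narrow specialist"), matching the separation the commit claims over Gini.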

Tests (17 cases across 4 functions):
- TestHerfindahlHelper: 9 cases covering empty, single non-zero,
  uniform (2 and 5 buckets), 70/30, 90/10, single-zero, zeros-only.
- TestSpecLabelBands: 8 cases at boundaries of each band.
- TestDevProfilesSpecialization: three synthetic devs (narrow in 1
  dir, focused 7/3 split, broad 5-dir uniform) assert the
  narrow > focused > broad ordering with specific values.
- TestDevProfilesSpecializationEdgeCases: dev with 0 files → 0;
  dev with 1 file in 1 dir → 1.0.

Validated on real repos. kubernetes (5,295 devs): 27% broad /
17.5% balanced / 18.2% focused / 37.3% narrow. The distribution
matches known structure — maintainers review broadly, subsystem
owners focus.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- METRICS.md Profile table: new Specialization row with full
  Herfindahl semantics, why Herfindahl beats Gini for this use
  case, and a pointer to the three threshold constants.
- METRICS.md Thresholds table: specBroadGeneralistMax,
  specBalancedMax, specFocusedMax added.
- README.md stats table row for `profile` mentions the new
  specialization index.
- README.md "Developer profile" section:
    * "Each profile includes" bullet list gains a Specialization
      entry with label explanation.
    * Collaboration bullet updated to mention that ranking uses
      shared_lines (Σ min(linesA, linesB)) — a drift-fix from the
      earlier SharedLines feature that was not reflected in the
      profile-specific section.
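The shared_lines ranking mentioned above can be illustrated with a minimal sketch. The per-file line maps and the function name here are assumptions for illustration; only the Σ min(linesA, linesB) formula comes from the commit message.

```go
package main

import "fmt"

// sharedLines ranks a collaboration pair by summing, over every file
// both devs authored lines in, the smaller of the two line counts:
// Σ min(linesA, linesB).
func sharedLines(a, b map[string]int) int {
	total := 0
	for file, la := range a {
		if lb, ok := b[file]; ok {
			if la < lb {
				total += la
			} else {
				total += lb
			}
		}
	}
	return total
}

func main() {
	alice := map[string]int{"stats.go": 120, "format.go": 30}
	bob := map[string]int{"stats.go": 80, "template.go": 50}
	// Only stats.go overlaps: min(120, 80) = 80.
	fmt.Println(sharedLines(alice, bob))
}
```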

No changes to the code; documentation only.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@lex0c
Owner Author

lex0c commented Apr 18, 2026

@codex review


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: eb2f46f676

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread internal/stats/stats.go
Comment thread internal/stats/stats.go Outdated
lex0c and others added 3 commits April 18, 2026 17:00
The Specialization label and Herfindahl value describe where a dev's
files live on disk, not the semantic area they work in. The two can
disagree whenever the dev's domain cuts across the directory
structure rather than aligning with it.

Concrete cases where the label misleads:
- A security engineer who refactored auth across api/, web/,
  gateway/, services/ touches four dirs → Herfindahl low → labeled
  "broad generalist" even though they are a domain specialist whose
  domain is cross-cutting.
- A release engineer maintaining CI config scattered across
  .github/, docker/, scripts/, deploy/ lands the same way.
- A generalist who happened to land a big single-module refactor in
  the recent window looks like a "narrow specialist" for that
  snapshot.

Add a caveat block in METRICS.md's "Behavior and caveats" section
with those three examples and a closing line: "The raw Herfindahl
value is objective; the interpretation of the label is not." Flag
the limitation directly in the Profile table row so readers see it
where they first encounter the metric. README's bullet gets a
one-line italic version pointing at the full caveat.

No code changes — the metric itself stays as-is; what changes is
how readers are guided to interpret it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
DevProfiles derived per-directory file counts using `dir := path` as
the fallback when the path contained no slash, so each repo-root
filename became its own pseudo-directory. A dev touching README,
Makefile, go.mod, and LICENSE ended up with four pseudo-dirs × 1 file
each, driving Herfindahl down to 0.25 ("balanced") and mislabelling
a narrow specialist on the repo root as a broad generalist.

DirectoryStats already handled this correctly (dir := "." fallback
at line 277). Align DevProfiles with the same convention. The fix is
one line; the comment above it explains why the otherwise tempting
`path` fallback is wrong.
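The one-line fix described above amounts to the following sketch, assuming a string-index fallback (the real code sits inside DevProfiles; only the "." convention is confirmed by the commit):

```go
package main

import (
	"fmt"
	"strings"
)

// dirOf buckets a path by its directory. Repo-root files all fall
// into "." rather than each becoming its own pseudo-directory —
// falling back to `path` itself would give a dev touching README,
// Makefile, go.mod, LICENSE four one-file buckets (Herfindahl 0.25)
// instead of one four-file bucket (Herfindahl 1.0).
func dirOf(path string) string {
	if i := strings.LastIndex(path, "/"); i >= 0 {
		return path[:i]
	}
	return "." // NOT path: all root-level files share one bucket
}

func main() {
	for _, p := range []string{"README.md", "go.mod", "internal/stats/stats.go"} {
		fmt.Println(p, "->", dirOf(p))
	}
}
```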

New regression test TestDevProfilesSpecializationRootFilesBucket pins
the behavior: a dev with four root-level files must produce
Specialization=1.0 and Scope=[{".", 4}], not Specialization=0.25
across four pseudo-dirs.

Real-data impact: ~200 devs across the four validation repos (86 on
pi-hole, 3 on praat, 3 on WordPress, 106 on kubernetes) were
top-scoped on repo-root files and of those roughly half now
correctly classify as specialists instead of being diluted into the
generalist band.

Breaking change for JSON/CSV consumers: DevProfile.Scope entries
previously carried filenames like "README.md" when a dev worked on
repo-root files; they now carry "." as the directory name. Any
downstream script that indexed Scope[].Dir by filename needs to be
updated — those scripts were reading incorrect data anyway, since
the intent of the field was always a directory, not a file.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
herfindahl rounded its return value to three decimals before handing
it back to DevProfiles, which then stored the rounded value in
Specialization and fed it directly to the label threshold comparisons
(< 0.15, < 0.35, < 0.7). The internal rounding caused quantization-
induced misclassification at band boundaries: a true Herfindahl of
0.1496 rounded to 0.150, which is NOT strictly less than 0.15, so it
flipped from "broad generalist" to "balanced".

Fix: remove the Round from herfindahl; return full float64 precision.
Rounding only happens at format time — both the CLI
(`fmt.Fprintf("%.2f", ...)`) and the HTML templates
({{printf "%.2f" ...}}) already round for display. Classification
logic now runs on the exact computed value.
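The quantization failure mode is easy to reproduce in isolation. A minimal sketch (the `roundedBad` helper is hypothetical, standing in for the removed internal rounding):

```go
package main

import (
	"fmt"
	"math"
)

// roundedBad reproduces the removed behavior: rounding to three
// decimals inside the helper before classification.
func roundedBad(h float64) float64 {
	return math.Round(h*1000) / 1000
}

func main() {
	h := 0.1496 // true Herfindahl value just under the 0.15 threshold

	// With internal rounding: 0.1496 -> 0.150, which is NOT < 0.15,
	// so the dev flips from "broad generalist" to "balanced".
	fmt.Println(roundedBad(h) < 0.15)

	// With full precision: classification is correct.
	fmt.Println(h < 0.15)

	// Rounding belongs only at display time.
	fmt.Printf("%.2f\n", h)
}
```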

Two new tests guard the invariant:

- TestHerfindahlPreservesPrecision: feeds [1,1,1] and asserts the
  result equals 1/3 within 1e-12. A previous version returning 0.333
  fails this test explicitly.
- TestSpecLabelBandsBoundaryPrecision: passes values just under and
  just over each threshold (0.149999, 0.150001, etc.) and verifies
  each lands in the expected band. Before the fix, values in the
  0.1495-0.1505 range could flip bands depending on how the internal
  rounding broke ties.

JSON output for Specialization will now show full-precision floats
for inputs that yield irrational results (e.g. 1/3 marshals as
"0.3333333333333333" instead of "0.333"). Consumers wanting a fixed
display precision should round at parse time; the honest value is
the full float.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@lex0c
Owner Author

lex0c commented Apr 18, 2026

@codex review


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 462557507e


Comment thread internal/stats/format.go Outdated
lex0c and others added 2 commits April 18, 2026 17:21
PrintProfiles formatted Specialization with %.2f while the band label
was derived from the unrounded float. At boundary values this
produced visually contradictory output: a true Herfindahl of 0.149
rendered as "0.15 (broad generalist)" where the shown number sits
exactly at the threshold and the label reads as wrong, even though
the classification is correct (the real value 0.149 IS < 0.15).

Switch the three display sites to %.3f. The number rendered now
distinguishes 0.149 from 0.150 and lines up with the label the code
computed from the full-precision value. This preserves the
classification-integrity property from the prior commit (4625575)
while eliminating the visual inconsistency.

Test TestPrintProfilesSpecializationDisplayPrecision constructs a
profile with Specialization=0.149, renders it through PrintProfiles,
and asserts the output contains both "0.149" (not "0.15") and
"broad generalist". Guards against regression to %.2f display or
rounding drift in the formatter.

Also switches embedded and standalone HTML profile templates to
the same %.3f.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The Specialization row in METRICS.md described the math and the label
thresholds, but not the relationship between what the CLI/HTML show
and what JSON output carries. With the display precision fix in the
previous commit (%.3f in CLI/HTML) and the precision-preservation
fix from 4625575 (full float64 in JSON), there is now a meaningful
asymmetry worth documenting:

  - CLI and HTML round to 3 decimals for readability.
  - JSON preserves the full float64 value so the label can be
    reproduced exactly by downstream consumers.
  - Classification runs against the raw float, not the displayed
    value. A value of 0.149 classifies as broad generalist even
    though a rounded "0.15" would not strictly be below the
    threshold.

Consumers that re-derive the label from JSON must use the raw
Specialization field, not a rounded copy. Added one sentence in the
Profile table row to make the contract explicit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@lex0c
Owner Author

lex0c commented Apr 18, 2026

@codex review

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. Bravo.


The pre-existing 2500 threshold fails the quality gate as the repo
accumulates activity across the stats expansion work. Bumping to
5500 restores headroom for ongoing development without disabling
the gate. The comment about adding --fail-on-busfactor 1 when the
team grows is preserved.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@lex0c lex0c merged commit 37993cb into main Apr 18, 2026
1 check passed
@lex0c lex0c deleted the feat/newstats branch April 18, 2026 20:27
lex0c added a commit that referenced this pull request Apr 20, 2026
Two related changes in the DevProfile Scope/Extensions render:

1. Denominator switch: cs.FilesTouched → len(devFiles[email])

   FilesTouched counts every path the dev appeared on, including pure
   renames (commits where add==0 && del==0 on that file). The Scope
   and Extensions numerators, however, are built from fe.devLines,
   which only tracks files where the dev wrote lines. The mismatch
   silently deflated all Pcts in repos with reorgs and left sums
   below 100% with no visible cause.

   Switching to the authored-file count as the denominator makes the
   two sides consistent, aligns with the Herfindahl specialization
   index which already used the authored population, and leaves
   truncation as the ONLY reason a Pct sum can drop below 100.
   Validated on 5726 real profiles (pi-hole, praat, WordPress,
   kubernetes): zero profiles with <5 buckets sum to anything other
   than 100% after the fix.

2. ScopeHidden / ExtensionsHidden counters + "+N more" rendering

   A reader seeing "Scope: foo (28%), bar (25%), ... (+6 more)" now
   understands the 85% sum comes from 6 hidden buckets; previously
   the 85% read as a bug. Surfaced in CLI (inline suffix), HTML main
   report profile card (inline italic span), and HTML dedicated
   profile (line below the bar legend). Silent when Hidden == 0.
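The "+N more" rendering can be sketched as below. The `bucket` type and `scopeLine` name are assumptions for illustration; the real rendering spans the CLI formatter and both HTML templates.

```go
package main

import (
	"fmt"
	"strings"
)

// bucket is a visible Scope entry: directory plus its percentage.
type bucket struct {
	Dir string
	Pct int
}

// scopeLine joins the visible buckets and, when some were truncated,
// appends a "(+N more)" suffix so a sub-100% sum is explained.
// Silent when hidden == 0, matching the commit's behavior.
func scopeLine(visible []bucket, hidden int) string {
	parts := make([]string, 0, len(visible))
	for _, b := range visible {
		parts = append(parts, fmt.Sprintf("%s (%d%%)", b.Dir, b.Pct))
	}
	s := strings.Join(parts, ", ")
	if hidden > 0 {
		s += fmt.Sprintf(" (+%d more)", hidden)
	}
	return s
}

func main() {
	fmt.Println(scopeLine([]bucket{{"foo", 28}, {"bar", 25}}, 6))
	fmt.Println(scopeLine([]bucket{{"foo", 60}, {"bar", 40}}, 0))
}
```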

Tests: regression on the denominator change (FilesTouched=5 but
authored=4, sum must be 100 not 80); explicit truncation-sum
invariant (6 buckets → 5 visible, sum<100); Hidden counters == 0 in
the common case; Hidden counters == exact drop count when
truncated.

METRICS.md: both Scope and Extensions rows updated with the new
denominator wording; Extensions caveat #2 (pure-rename gap) removed
since it no longer applies.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
